STARS - 2015


Section: New Results

Optimizing People Tracking for a Video-camera Network

Participants : Julien Badie, François Brémond.

Keywords: tracking quality estimation, error recovery, tracklet matching

This work addresses the problem of improving tracking quality at runtime. Most state-of-the-art tracking algorithms, as well as high-level algorithms such as event recognition, have difficulty handling erroneous input. This framework detects and repairs detection and tracking errors. It works online, even when no prior knowledge of the scene (such as contextual information or training data) is available.

The Global Tracker (figure 13) uses tracking results (tracklets) as input and produces corrected tracklets as output.

Figure 13. The Global Tracker framework, combining online evaluation and tracklet matching to improve tracking results.
IMG/InOutGlobalTracker.png
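Concretely, a tracklet can be thought of as a per-object fragment of trajectory carrying per-frame geometry and appearance. The structure below is only a hypothetical representation, used by the sketches later in this section; it is not the actual data model of the Global Tracker.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class Tracklet:
    """Hypothetical track fragment for one object (one row per frame)."""
    object_id: int
    frames: np.ndarray      # frame indices, shape (T,)
    boxes: np.ndarray       # bounding boxes (x, y, w, h), shape (T, 4)
    appearance: np.ndarray  # per-frame appearance descriptors, shape (T, D)
```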

The Global Tracker framework is divided into two main modules:

  • Online evaluation of tracking results: the quality of the tracking results is estimated by analyzing the variation of each tracklet feature. A significant feature variation is treated as a potential error, an anomaly. To determine whether this anomaly is a real error or a natural phenomenon, we use information given by the object's neighborhood and by the context. Finally, the errors are corrected either by removing the erroneous nodes (basic approach) or by sending a signal to the tracking algorithm so that it tunes its parameters for the next frames (feedback approach). A minimal sketch of the anomaly-detection step is given after this list.

  • Tracklet matching over time: tracklets representing the same object are merged by a four-step algorithm. First, key frames (frames whose features are close to the tracklet's mean feature values) are selected for each tracklet. Then a visual signature is computed from these key frames. Next, the distance between each pair of signatures is computed. Finally, tracklets are merged using unsupervised learning and a constrained clustering algorithm, in which all tracklets representing the same object end up in the same cluster. The second sketch after this list illustrates these four steps.
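As an illustration of the online evaluation step, the sketch below flags frames where one tracklet feature (for instance the bounding-box height taken from the hypothetical Tracklet structure above) deviates strongly from its local behaviour. The report does not specify the statistic, window size or threshold used; the sliding-window median/MAD rule and the parameter values here are assumptions for illustration only.

```python
import numpy as np

def detect_anomalies(feature, window=5, threshold=3.0):
    """Return indices of frames whose feature value varies abnormally.

    `feature` is a 1-D array of one tracklet feature over time.  A frame
    is flagged when it lies more than `threshold` robust standard
    deviations away from the median of its temporal neighbourhood.
    """
    x = np.asarray(feature, dtype=float)
    anomalies = []
    for t in range(len(x)):
        lo, hi = max(0, t - window), min(len(x), t + window + 1)
        neighbours = np.concatenate([x[lo:t], x[t + 1:hi]])
        if neighbours.size == 0:
            continue
        med = np.median(neighbours)
        mad = np.median(np.abs(neighbours - med))
        sigma = 1.4826 * mad + 1e-9   # robust estimate of the std deviation
        if abs(x[t] - med) > threshold * sigma:
            anomalies.append(t)
    return anomalies
```

In the basic approach described above, flagged nodes would simply be removed; in the feedback approach, they would instead trigger a parameter update of the underlying tracker.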
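The four matching steps (key-frame selection, visual signature, pairwise distances, constrained clustering) can be sketched as follows. Two simplifications are assumed here: key frames are selected from the appearance descriptors rather than from the full feature set, and the cannot-link constraints between temporally overlapping tracklets are only approximated by inflating their pairwise distance before a standard agglomerative clustering. This is a simplified stand-in, not the constrained clustering algorithm actually used.

```python
import numpy as np
from scipy.spatial.distance import cdist, squareform
from scipy.cluster.hierarchy import linkage, fcluster

def key_frames(appearance, k=3):
    # Step 1: frames whose descriptor is closest to the tracklet mean.
    mean = appearance.mean(axis=0)
    order = np.argsort(np.linalg.norm(appearance - mean, axis=1))
    return order[:k]

def visual_signature(appearance, k=3):
    # Step 2: average the appearance descriptors of the key frames.
    return appearance[key_frames(appearance, k)].mean(axis=0)

def merge_tracklets(tracklets, n_objects, cannot_link=()):
    """Cluster tracklets so that fragments of the same object share a label.

    `tracklets` is a list of Tracklet objects, `n_objects` the expected
    number of distinct people, and `cannot_link` contains pairs of indices
    of tracklets that overlap in time and thus cannot be the same object.
    """
    # Step 3: pairwise distances between visual signatures.
    sigs = np.stack([visual_signature(t.appearance) for t in tracklets])
    dist = cdist(sigs, sigs)
    for i, j in cannot_link:
        dist[i, j] = dist[j, i] = 1e6   # approximate cannot-link constraint
    # Step 4: agglomerative clustering on the precomputed distances.
    Z = linkage(squareform(dist, checks=False), method='average')
    return fcluster(Z, t=n_objects, criterion='maxclust')
```

Tracklets that end up with the same cluster label are merged into a single corrected track.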

This approach has been tested on several datasets, such as PETS2009 (table 7), CAVIAR (table 8), TUD, I-LIDS and VANAHEIM, and in different kinds of scenarios (tracking associated with a controller, a 3D camera, and camera networks with overlapping or distant cameras). In each case, the results reach or outperform the state of the art.

Table 7. Tracking results on sequence S2.L1.View1 of the PETS2009 dataset

  Methods              MOTA   MOTP
  Zamir et al. [95]    0.90   0.69
  Milan et al. [75]    0.90   0.74
  Online evaluation    0.90   0.74
  Tracklet matching    0.83   0.68
  Global Tracker       0.92   0.76
Table 8. Tracking results on the CAVIAR dataset

  Methods              MT (%)   PT (%)   ML (%)
  Li et al. [73]       84.6     14.0     1.4
  Kuo et al. [70]      84.6     14.7     0.7
  Online evaluation    82.6     11.7     5.7
  Tracklet matching    84.6     9.5      5.9
  Global Tracker       86.4     8.3      5.3
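
For reference, MOTA and MOTP in table 7 are the standard CLEAR MOT metrics, while MT, PT and ML in table 8 are the usual trajectory-based ratios: the fraction of ground-truth trajectories that are mostly tracked (commonly, covered for more than 80% of their length), partially tracked, or mostly lost (covered for less than 20%). Recalling the standard CLEAR MOT definitions (the exact evaluation protocol is that of the cited benchmarks):

$$\mathrm{MOTA} = 1 - \frac{\sum_t \left(\mathrm{FN}_t + \mathrm{FP}_t + \mathrm{IDSW}_t\right)}{\sum_t \mathrm{GT}_t}, \qquad \mathrm{MOTP} = \frac{\sum_{i,t} d_{i,t}}{\sum_t c_t},$$

where $\mathrm{FN}_t$, $\mathrm{FP}_t$ and $\mathrm{IDSW}_t$ are the numbers of missed detections, false positives and identity switches at frame $t$, $\mathrm{GT}_t$ is the number of ground-truth objects at frame $t$, $d_{i,t}$ is the overlap (or localization error, depending on the benchmark) of matched hypothesis $i$ at frame $t$, and $c_t$ is the number of matches at frame $t$.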

This approach is described in more detail in the PhD manuscript [27].